Axioms for Rational Reinforcement Learning

Authors

  • Peter Sunehag
  • Marcus Hutter
Abstract

We provide a formal, simple and intuitive theory of rational decision making, including sequential decisions that affect the environment. The theory has a geometric flavor, which makes the arguments easy to visualize and understand. Our theory is for complete decision makers, that is, decision makers with a complete set of preferences. Our main result shows that a complete rational decision maker implicitly has a probabilistic model of the environment. We have a countable version of this result that sheds light on the issue of countable versus finite additivity by showing how it depends on the geometry of the space over which we have preferences. This is achieved by connecting rationality with the Hahn-Banach Theorem. The theory presented here can be viewed as a formalization and extension of the betting odds approach to probability of Ramsey and De Finetti [Ram31, deF37].

Similar resources

Non-Rational Discrete Choice Based On Q-Learning And The Prospect Theory

When modelling human discrete choice the standard approach is to adopt the rational model. This has been shown, however, to fail systematically under some conditions, which makes evident the need for a better approach. The choice model is however only part of the problem because it does not say how to deal with uncertainty, where learning is necessary. In this regard, some evidences support the...

Rationality, optimism and guarantees in general reinforcement learning

In this article, we present a top-down theoretical study of general reinforcement learning agents. We begin with rational agents with unlimited resources and then move to a setting where an agent can only maintain a limited number of hypotheses and optimizes plans over a horizon much shorter than what the agent designer actually wants. We axiomatize what is rational in such a setting in a manne...

Can I Do That? Discovering Domain Axioms Using Declarative Programming and Relational Reinforcement Learning

Robots deployed to assist humans in complex, dynamic domains need the ability to represent, reason with, and learn from, different descriptions of incomplete domain knowledge and uncertainty. This paper presents an architecture that integrates declarative programming and relational reinforcement learning to support cumulative and interactive discovery of previously unknown axioms governing doma...

On characterizations of the fully rational fuzzy choice functions

In the present paper, we introduce the fuzzy Nehring axiom, the fuzzy Sen axiom and a weaker form of the weak fuzzy congruence axiom. We establish interrelations between these axioms and their relation with the fuzzy Chernoff axiom. We express full rationality of a fuzzy choice function using these axioms along with the fuzzy Chernoff axiom.

A Modular On-line Profit Sharing Approach in Multiagent Domains

How to coordinate the behaviors of the agents through learning is a challenging problem within multi-agent domains. Because of its complexity, recent work has focused on how coordinated strategies can be learned. Here we are interested in using reinforcement learning techniques to learn the coordinated actions of a group of agents, without requiring explicit communication among them. However, t...

Publication date: 2011